A hybrid approach to online speaker diarization

Authors

  • Carlos Vaquero
  • Oriol Vinyals
  • Gerald Friedland
Abstract

This article presents a low-latency speaker diarization system (“who is speaking now?”) based on a hybrid approach that combines a traditional offline speaker diarization system (“who spoke when?”) with an online speaker identification system. The system fulfills all requirements of the diarization task, i.e., it needs no a priori information about the input, and in particular no speaker-specific models. After an initialization phase, the approach allows a low-latency decision on the current speaker with an accuracy close to that of the underlying offline diarization system. The article describes the approach, evaluates the robustness of the system, and analyzes the latency/accuracy trade-off.
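The hybrid structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar "features", the greedy clustering used as a stand-in for the offline diarization system, the nearest-centroid rule used as a stand-in for online speaker identification, and all names and parameters (`offline_diarize`, `identify`, `hybrid_diarize`, `init_frames`, `threshold`) are assumptions made for illustration only.

```python
from statistics import fmean


def offline_diarize(frames, threshold=2.0):
    """Stand-in for the offline system (assumption): greedy clustering of
    scalar features. The number of speakers is not given in advance."""
    clusters = []  # each cluster holds the feature values assigned to it
    for x in frames:
        # Nearest existing cluster, or None if there is no cluster yet.
        best = min(clusters, key=lambda c: abs(fmean(c) - x), default=None)
        if best is not None and abs(fmean(best) - x) <= threshold:
            best.append(x)
        else:
            clusters.append([x])  # start a new speaker cluster
    # One centroid per discovered speaker acts as a toy speaker model.
    return [fmean(c) for c in clusters]


def identify(frame, centroids):
    """Stand-in for online speaker identification (assumption):
    index of the closest speaker model."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - frame))


def hybrid_diarize(stream, init_frames=6):
    """Return one label per frame: None during the initialization phase,
    then a speaker index decided frame by frame."""
    buffer, centroids, labels = [], None, []
    for frame in stream:
        if centroids is None:
            # Initialization phase: collect audio, no decision yet.
            buffer.append(frame)
            labels.append(None)
            if len(buffer) == init_frames:
                centroids = offline_diarize(buffer)
        else:
            # Low-latency phase: decide the current speaker immediately.
            labels.append(identify(frame, centroids))
    return labels


if __name__ == "__main__":
    stream = [0.1, 0.2, 5.0, 5.1, 0.0, 4.9, 0.15, 5.05, 0.05]
    print(hybrid_diarize(stream))
```

In this toy setting, the length of the initialization buffer plays the role of the latency/accuracy trade-off mentioned in the abstract: a longer buffer delays the first decision but gives the offline stage more data from which to build speaker models.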


Similar resources

Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation

Almost all current diarization systems are off-line and ill-suited to the growing need for on-line or real-time diarization. Our previous work reported the first on-line diarization system for the most challenging speaker diarization domain involving meeting data captured with a single distant microphone (SDM). Even if results were not dissimilar to those reported for online diarization in less ...


Online two speaker diarization

Short conversations pose some challenges for online diarization due to data sparseness and unbalanced representation of the two speakers. This paper presents our recent advances in online diarization of two-wire telephone conversations, introducing several methods for improving processing efficiency and accuracy on short conversations. Our framework is based on the offline diarization of a conv...


Using a GPU, Online Diarization = Offline Diarization

This article presents a low-latency, online speaker diarization system (“who is speaking now?”) based on the repeated execution of a GPU-optimized, highly efficient offline diarization system (“who spoke when”). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In contrast to earli...


Confidence for Speaker Diarization using PCA Spectral Ratio

Confidence scoring is an important component in speaker diarization systems, both for offline speech analytics and for online diarization that are required to produce the speaker segmentation from very little audio. This paper proposes a confidence measure for speaker diarization based on the spectral ratio of the eigenvalues of the Principal Component Analysis (PCA) transformation computed on ...


Online Diarization of Telephone Conversations

Speaker diarization systems attempt to segment and label a conversation between R speakers when no prior information about the conversation is given. In essence, diarization systems try to answer the question “Who spoke when?”. In order to perform speaker diarization, most state-of-the-art diarization systems operate in an off-line mode, that is, all of the samples of ...




Publication date: 2010